This page last changed on May 04, 2012 by wlapka.

This page describes how to install and configure the SAM-Gridmon node type from scratch.

IMPORTANT
The SAM-Gridmon node type is not officially supported. These are the installation instructions for the latest SAM release (SAM-Update-17), which is an internal release, so please don't install it. You can find information about previous releases in the Installing SAM-Gridmon page of the SAM Administrator's Guide.
Before following these instructions, make sure you have completed the basic development setup process (sections 1 to 3.2 of the Development Process guide).

Environment

Disable SELinux in /etc/selinux/config:

SELINUX=disabled
You must reboot the machine if you change this variable.
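
As a quick way to confirm the setting above, the following sketch reads the configured SELINUX value from a config file. The function name selinux_config_mode is hypothetical and not part of SAM; getenforce is the standard SELinux tool for the mode currently in effect.

```shell
# Hypothetical helper (not part of SAM): print the SELINUX= value found in
# the given config file. Comment lines and SELINUXTYPE= are ignored.
selinux_config_mode() {
    sed -n 's/^[ \t]*SELINUX=[ \t]*\([A-Za-z]*\).*$/\1/p' "$1" | tail -n 1
}

# Example usage on a real host:
#   selinux_config_mode /etc/selinux/config   # configured mode
#   getenforce                                # mode currently in effect
```

If the configured mode is "disabled" but getenforce still reports another mode, the machine has most likely not been rebooted since the change.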

Requirements

You need to install a host certificate to secure the Nagios web portal. The certificate should be placed in the standard location:

ls -l /etc/grid-security/host*
-rw-r--r-- 1 root root 2286 Oct 28 19:26 /etc/grid-security/hostcert.pem
-r-------- 1 root root  887 Oct 28 19:25 /etc/grid-security/hostkey.pem
The /etc/grid-security directory must have 755 permissions, and the certificate must have the SSL client attribute:
openssl x509 -in /etc/grid-security/hostcert.pem -noout -purpose | grep "SSL client"
SSL client : Yes
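
The two checks above can be combined into a small script. This is only a sketch: the helper names has_mode and cert_is_ssl_client are hypothetical, and stat -c assumes GNU coreutils, as shipped with SLC5/CentOS.

```shell
# Hypothetical helper: succeed if the path has exactly the expected octal mode.
# Relies on GNU stat (-c '%a').
has_mode() {
    [ "$(stat -c '%a' "$1")" = "$2" ]
}

# Hypothetical helper: succeed if openssl reports the certificate as usable
# for the "SSL client" purpose.
cert_is_ssl_client() {
    openssl x509 -in "$1" -noout -purpose | grep -q '^SSL client : Yes'
}

# Example usage on the Nagios host:
#   has_mode /etc/grid-security 755 \
#     && cert_is_ssl_client /etc/grid-security/hostcert.pem \
#     && echo "host certificate looks usable"
```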

If you plan to use the SAM DB (i.e. NCG_TOPOLOGY_USE_SAM or NCG_REMOTE_USE_SAM set to true) you need to request access to SAM PI from your Nagios host. Details on enabling access are maintained by the SAM team here. In the request you should provide the machine address(es) and simply specify that you require access under the "EGEE-SA1 Monitoring Profile".

Repositories

Install YUM and rpmforge packages:

Remove the old lcg-CA repository, if installed:

  • rm -f /etc/yum.repos.d/lcg-CA.repo

Repositories List

Configure the following repositories:

  • EGI CAs (egi-trustanchors.repo)
  • gLite BDII (glite-BDII.repo)
  • gLite UI (glite-UI.repo)
  • DAG (dag.repo)
    [dag]
    name=DAG (http://dag.wieers.com) add-on packages, no formal support from CERN
    baseurl=http://linuxsoft.cern.ch/dag/redhat/el5/en/$basearch/dag
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-dag
    gpgcheck=1
    enabled=0
    protect=0
  • SAM (rpm with repo files)
    EGI sites should use this repository instead
    [egi-sam]
    name=EGI SAM repo
    baseurl=http://repository.egi.eu/sw/production/sam/1/$basearch
    enabled=1
    gpgcheck=0
    protect=1
    priority=10

Repository Priorities

Install yum-priorities:

yum install yum-priorities

Modify repository files:

The glite-UI repo can have a higher priority than rpmforge.
  • rpmforge.repo
    ### Name: RPMforge RPM Repository for RHEL 5 - dag
    ### URL: http://rpmforge.net/
    [rpmforge]
    name = RHEL $releasever - RPMforge.net - dag
    baseurl = http://apt.sw.be/redhat/el5/en/$basearch/rpmforge
    mirrorlist = http://apt.sw.be/redhat/el5/en/mirrors-rpmforge
    mirrorlist = file:///etc/yum.repos.d/mirrors-rpmforge
    enabled = 1
    protect = 0
    gpgkey = file:///etc/pki/rpm-gpg/RPM-GPG-KEY-rpmforge-dag
    gpgcheck = 1
    priority=11
    exclude=libyaml,python-django
    
    [rpmforge-extras]
    name = RHEL $releasever - RPMforge.net - extras
    baseurl = http://apt.sw.be/redhat/el5/en/$basearch/extras
    mirrorlist = http://apt.sw.be/redhat/el5/en/mirrors-rpmforge-extras
    mirrorlist = file:///etc/yum.repos.d/mirrors-rpmforge-extras
    enabled = 1
    protect = 0
    gpgkey = file:///etc/pki/rpm-gpg/RPM-GPG-KEY-rpmforge-dag
    gpgcheck = 1
    priority=11
    
    [rpmforge-testing]
    name = RHEL $releasever - RPMforge.net - testing
    baseurl = http://apt.sw.be/redhat/el5/en/$basearch/testing
    mirrorlist = http://apt.sw.be/redhat/el5/en/mirrors-rpmforge-testing
    mirrorlist = file:///etc/yum.repos.d/mirrors-rpmforge-testing
    enabled = 0
    protect = 0
    gpgkey = file:///etc/pki/rpm-gpg/RPM-GPG-KEY-rpmforge-dag
    gpgcheck = 1
  • sa1-centos5-release.repo
    EGI sites should use EGI SAM repository described above.
    [egee-sa1]
    name=EGEE Packages from SA1 for CentOS5
    baseurl=http://rpm.hellasgrid.gr/mash/centos5-egee/$basearch
            http://www.sysadmin.hep.ac.uk/rpms/egee-SA1/centos5/$basearch
    enabled=1
    gpgcheck=0
    protect=1
    priority=10
    metadata_expire=1
  • slc5-extras.repo
    [slc5-extras]
    name=Scientific Linux CERN 5 (SLC5) add-on packages, no formal support
    baseurl=http://linuxsoft.cern.ch/cern/slc5X/$basearch/yum/extras/
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-cern
           file:///etc/pki/rpm-gpg/RPM-GPG-KEY-jpolok
    gpgcheck=1
    enabled=1
    protect=1
    priority=1
  • slc5-os.repo
    [slc5-os]
    name=Scientific Linux CERN 5 (SLC5) base system packages
    baseurl=http://linuxsoft.cern.ch/cern/slc5X/$basearch/yum/os/
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-cern
           file:///etc/pki/rpm-gpg/RPM-GPG-KEY-jpolok
           file:///etc/pki/rpm-gpg/RPM-GPG-KEY-csieh
           file:///etc/pki/rpm-gpg/RPM-GPG-KEY-dawson
    gpgcheck=1
    enabled=1
    protect=1
    priority=1
    exclude=php*,perl-DBI,MySQL-python,c-ares,perl-DBD-MySQL
  • slc5-updates.repo
    [slc5-updates]
    name=Scientific Linux CERN 5 (SLC5) bugfix and security updates
    baseurl=http://linuxsoft.cern.ch/cern/slc5X/$basearch/yum/updates/
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-cern
           file:///etc/pki/rpm-gpg/RPM-GPG-KEY-jpolok
           file:///etc/pki/rpm-gpg/RPM-GPG-KEY-csieh
           file:///etc/pki/rpm-gpg/RPM-GPG-KEY-dawson
    gpgcheck=1
    enabled=1
    protect=1
    priority=1
    exclude=php*,perl-DBI,MySQL-python,c-ares,perl-DBD-MySQL
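
With yum-priorities, a lower number means a higher priority, so it can help to review the effective ordering once all repository files are in place. The sketch below lists the priority set in each repository section; the function name repo_priorities is hypothetical, and sections without a priority line are reported as "unset".

```shell
# Hypothetical helper: print "<repo-id> <priority>" for every section of the
# given .repo files. Sections with no priority= line are shown as "unset".
repo_priorities() {
    awk '
        function emit() { if (id != "") print id, (prio == "" ? "unset" : prio) }
        /^[ \t]*\[.*\]/ {
            emit()
            id = $0
            gsub(/^[ \t]*\[|\].*$/, "", id)   # strip the surrounding brackets
            prio = ""
            next
        }
        /^[ \t]*priority[ \t]*=/ {
            prio = $0
            sub(/^[^=]*=[ \t]*/, "", prio)    # keep only the value
            sub(/[ \t]+$/, "", prio)
        }
        END { emit() }
    ' "$@"
}

# Example usage:
#   repo_priorities /etc/yum.repos.d/*.repo | sort -k2,2n
```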

Installation

yum install lcg-CA httpd subversion
yum --enablerepo=slc5-cernonly -y install oracle-instantclient-basic oracle-instantclient-sqlplus oracle-instantclient-tnsnames.ora perl-DBD-Oracle
yum install sam-gridmon

Install the cx_Oracle python bindings from http://cx-oracle.sourceforge.net

wget -O cx_Oracle-5.1.1-10g-py24-1.x86_64.rpm "http://prdownloads.sourceforge.net/cx-oracle/cx_Oracle-5.1.1-10g-py24-1.x86_64.rpm?download"
rpm -i cx_Oracle-5.1.1-10g-py24-1.x86_64.rpm

Make sure sqlplus works. If not, you may need to add the Oracle home to your library path.

$ echo /usr/lib64/oracle/10.x.x.x/client/lib64 >> /etc/ld.so.conf.d/oracle-instantclient.conf
$ ldconfig
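
To see whether a library path problem is what prevents sqlplus from starting, you can list the shared libraries that the dynamic linker fails to resolve. A minimal sketch, using a hypothetical helper named missing_libs:

```shell
# Hypothetical helper: print the names of shared libraries that ldd reports
# as "not found" for the given binary. Empty output means all were resolved.
missing_libs() {
    ldd "$1" 2>/dev/null | awk '/not found/ { print $1 }'
}

# Example usage:
#   missing_libs "$(command -v sqlplus)"
```

If the output is non-empty before running ldconfig and empty afterwards, the library path change has taken effect.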

Database Deployment

SAM-Gridmon requires an Oracle database. To deploy the required schema, perform the following steps:

Make sure the database account you are using has been granted the privileges to execute any job class and to create scheduler jobs, and has an explicit grant to create tables (not one inherited from a role).
  1. Check out the Oracle database tools
    svn co https://svn.cern.ch/reps/sam/trunk/devel-scripts/databases/oracle
    cd oracle
  2. Drop all database objects
    sqlplus <db_account@db_service/db_pass> @drop_dbschema.sql
  3. Create tablespace for MRS table
    Execute this step only on production instances. It should be revised so that the tablespace creation is done inside MRS.
    sqlplus <db_account@db_service/db_pass> @create_tablespace.sql
  4. Create database schema.
    perl deploy_dbschema.pl
    <db_account@db_service/db_pass>
  5. Recreate synonyms (only for Oracle accounts with writers and readers)
    sqlplus <db_account@db_service/db_pass> @get_synonyms.sql
    sqlplus <db_account_R@db_service_read/db_pass_read> @recreate_synonyms.sql
    sqlplus <db_account_W@db_service_write/db_pass_write> @recreate_synonyms.sql
  6. Grant privileges (only for Oracle accounts with writers and readers)
    sqlplus <db_account@db_service/db_pass> @grant_privileges.sql
  7. Run ATP Synchronizer
    Set your database connection settings in /etc/atp/atp_db.conf
    atp_synchro -d /etc/atp/atp_db.conf -c /etc/atp/atp_synchro.conf -l /etc/atp/atp_logging_files.conf
  8. Run MDDB Synchronizer
    ssh root@mddb-central
    cd /opt/<node>
    vi etc/databases.yml   # check that the DB settings are correct
    ./sync_to_<node>
  9. Bootstrapping MRS
    mkdir -p /tmp/oracle_bootstrap
    cd /tmp/oracle_bootstrap
    svn co https://svn.cern.ch/reps/sam/trunk/mrs/DBScripts/oracle_bootstrap/
    cd oracle_bootstrap
    sqlplus <db_account@db_service/db_pass> @OSG_Bootstrapper.sql
    sqlplus <db_account@db_service/db_pass> @supported_profiles.sql
  10. Bootstrapping ACE
    mkdir -p /tmp/oracle_bootstrap
    cd /tmp/oracle_bootstrap
    svn co https://svn.cern.ch/reps/sam/trunk/ace/schema/oracle/ver_?_?/
    cd ver_?_?
    sqlplus <db_account@db_service/db_pass> @add_algorithms.sql
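
Several of the steps above run SQL scripts through sqlplus, which by default continues after SQL errors, so a failed script can go unnoticed. The sketch below wraps a script so that sqlplus exits with a failure status on the first SQL error. The function name run_sql is hypothetical, and the connect string is passed through unchanged in whatever form your site uses.

```shell
# Hypothetical helper: run one SQL script and fail on the first SQL error.
#   $1 = connect string, $2 = script file
# WHENEVER SQLERROR EXIT FAILURE makes sqlplus stop with a non-zero status;
# -S suppresses the banner and prompts.
run_sql() {
    printf 'WHENEVER SQLERROR EXIT FAILURE\n@%s\nEXIT\n' "$2" | sqlplus -S "$1"
}

# Example usage (placeholders as in the steps above):
#   run_sql "<connect string>" drop_dbschema.sql || echo "drop_dbschema.sql failed"
```

Because the connect string is given on the command line, it may be visible to other users in the process list, so this form is best kept to test instances.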

Configuration

The configuration of all SAM-Gridmon boxes is based on YAIM (See custom YAIM configuration).

The following variables must be set.

Edit YAIM configuration file:

# GENERIC
SITE_NAME=CERN-PROD
BDII_HOST=lcg-bdii.cern.ch

VOS="dteam ops"
VO_DTEAM_VOMS_SERVERS='vomss://voms.hellasgrid.gr:8443/voms/dteam?/dteam/'
VO_DTEAM_VOMSES="'dteam voms.hellasgrid.gr 15004 /C=GR/O=HellasGrid/OU=hellasgrid.gr/CN=voms.hellasgrid.gr dteam 24' 'dteam voms2.hellasgrid.gr 15004 /C=GR/O=HellasGrid/OU=hellasgrid.gr/CN=voms2.hellasgrid.gr dteam 24'"
VO_DTEAM_VOMS_CA_DN="'/C=GR/O=HellasGrid/OU=Certification Authorities/CN=HellasGrid CA 2006' '/C=GR/O=HellasGrid/OU=Certification Authorities/CN=HellasGrid CA 2006'"
VO_OPS_VOMS_SERVERS="vomss://voms.cern.ch:8443/voms/ops?/ops/"
VO_OPS_VOMSES="'ops lcg-voms.cern.ch 15009 /DC=ch/DC=cern/OU=computers/CN=lcg-voms.cern.ch ops 24' 'ops voms.cern.ch 15004 /DC=ch/DC=cern/OU=computers/CN=voms.cern.ch ops 24'"
VO_OPS_VOMS_CA_DN="'/DC=ch/DC=cern/CN=CERN Trusted Certification Authority' '/DC=ch/DC=cern/CN=CERN Trusted Certification Authority'"
RB_HOST=skurut2.cesnet.cz # irrelevant, RB is unsupported
VO_DTEAM_WMS_HOSTS="wms204.cern.ch wms205.cern.ch" # set to your NGI WMSes
VO_OPS_WMS_HOSTS="wms204.cern.ch wms205.cern.ch" # set to your NGI WMSes

# DATABASE
DB_TYPE=oracle
DB_NAME=<db_name>
DB_USER=<db_user>
DB_PASS=<db_pass>

# MESSAGING
MSG_CONSUME2DB_TYPE="non-durable"
MS_CONSUMER_NAME="<client id>" # hostname with dots replaced by underscores, e.g. grid_monitoring_test_cern_ch
MSG_BROKER_CACHE_HOST="sam-validation.msg.cern.ch"

# NAGIOS
NAGIOS_ADMIN_DNS="/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=wlapka/CN=623537/CN=Wojciech Lapka,/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=straylen/CN=613539/CN=Steve Traylen,/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=jshade/CN=468767/CN=John Shade,/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=jamesc/CN=380618/CN=James Casey,/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=kskaburs/CN=658461/CN=Konstantin Skaburskas,/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=dcollado/CN=496848/CN=David Collados Polidura,/C=IT/O=INFN/OU=Personal Certificate/L=Roma 1/CN=Alessandro Di Girolamo,/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=asciaba/CN=430796/CN=Andrea Sciaba,/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=sciaba/CN=430796/CN=Andrea Sciaba,/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=pmendez/CN=477458/CN=Patricia Mendez Lorenzo,/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=mpaladin/CN=696692/CN=Massimo Paladin,/DC=org/DC=doegrids/OU=People/CN=Vikas Singhal 692459,/O=Grid/O=NorduGrid/OU=ndgf.org/CN=Anders Rhod Gregersen,/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=magini/CN=577890/CN=Nicolo Magini,/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=girolamo/CN=614260/CN=Alessandro Di Girolamo,/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=santinel/CN=564059/CN=Roberto Santinelli,/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=mbabik/CN=555091/CN=Marian Babik"
NAGIOS_ROLE="central-web"
NAGIOS_HTTPD_ENABLE_CONFIG=true

# ATP
ATP_VO_FEEDS="<list of VOs>"
ATP_VO_FEED_<vo1>="<vo feed url>"
ATP_VO_FEED_<vo2>="<vo feed url>"

# MYWLCG
ENABLE_MYWLCG_ALIAS=1
MYWLCG_DB_LIMIT=50000
MYWLCG_ACCESS_PERIOD=5
MYWLCG_NUMBER_OF_ACCESSES=100
MYEGI_ADMIN_NAME="Admin Name"
MYEGI_ADMIN_EMAIL=it-dep-gt-tom-services@cern.ch
MYEGI_DEFAULT_PROFILE=ROC
MYEGI_ACE="True"

# POEM
POEM_ADMIN_NAME="Admin Name"
POEM_ADMIN_EMAIL="sam-nagios-val@cern.ch"

# MDDB
NCG_MDDB_SUPPORTED_PROFILES="ROC,ROC_CRITICAL,ROC_OPERATORS,GLEXEC"

# OTHERS
DAEMON_USER="edguser"
DAEMON_GROUP="edguser"

You may need to change your database connection settings, as well as the admin data.
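
The consumer name in the MESSAGING section is described as the hostname with dots replaced by underscores. A minimal sketch of that transformation, using a hypothetical helper named consumer_name (it replaces dots only and leaves every other character untouched):

```shell
# Hypothetical helper: replace every dot in a hostname with an underscore.
consumer_name() {
    printf '%s\n' "$1" | tr '.' '_'
}

# Example usage:
#   consumer_name "$(hostname -f)"
```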

Run YAIM:

/opt/glite/yaim/bin/yaim -s site-info.def -c -n glite-NAGIOS_WEB
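
Before running YAIM it can be worth checking the configuration file for shell syntax errors such as an unterminated quote. The sketch below assumes the file is read as bash syntax, as the variable assignments above suggest; the function name check_site_info is hypothetical.

```shell
# Hypothetical helper: parse the configuration file with bash without
# executing it (-n). Returns non-zero on a syntax error.
check_site_info() {
    bash -n "$1"
}

# Example usage:
#   check_site_info site-info.def && echo "syntax OK"
```

This only catches syntax errors. A value containing spaces that is not quoted is still valid syntax, so it will not be reported.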

For MRS bootstrapping to work correctly, /etc/ncg-metric-config.d must contain the metric configurations from all SAM-Nagios instances for which availabilities are computed. The standard OPS metrics are already in /etc/ncg-metric-config.conf, so only additional configurations need to be generated and uploaded, as described in the NCG guide.

Additional Configuration

Throttling of MyWLCG WEB API

Performance limits in the MyWLCG/MyEGI portal are set by YAIM variables.
These variables have the default values listed below:

# Limit number of rows that can be fetched at a time to avoid DB dumps.
MYWLCG_DB_LIMIT=50000
# Limit number of accesses per IP address in a given time period (seconds).
MYWLCG_ACCESS_PERIOD=5
MYWLCG_NUMBER_OF_ACCESSES=100
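
To get a feel for what the defaults mean, the sketch below works out the average request rate they allow. It assumes the two access variables are read together as "at most MYWLCG_NUMBER_OF_ACCESSES requests per IP address within MYWLCG_ACCESS_PERIOD seconds", which is an interpretation of the comments above. The function name allowed_rate is hypothetical.

```shell
MYWLCG_ACCESS_PERIOD=5
MYWLCG_NUMBER_OF_ACCESSES=100

# Hypothetical helper: average number of requests per second allowed per IP.
#   $1 = period in seconds, $2 = number of accesses (integer division)
allowed_rate() {
    echo $(( $2 / $1 ))
}

allowed_rate "$MYWLCG_ACCESS_PERIOD" "$MYWLCG_NUMBER_OF_ACCESSES"   # prints 20
```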

Validation

After YAIM has run successfully, you should be able to access the SAM-Gridmon web interface at https://SAMGRIDMON_SERVER/mywlcg or https://SAMGRIDMON_SERVER/myegi.
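
The same check can be done from the command line. This is a sketch only: the helper names portal_urls and check_portals are hypothetical, and the portal may require a client certificate that this example does not supply.

```shell
# Hypothetical helper: print the two portal URLs for a given server name.
portal_urls() {
    for path in mywlcg myegi; do
        printf 'https://%s/%s\n' "$1" "$path"
    done
}

# Hypothetical helper: print the HTTP status code returned for each URL.
# -k skips certificate verification, acceptable only for a quick test.
check_portals() {
    portal_urls "$1" | while read -r url; do
        printf '%s %s\n' "$(curl -k -s -o /dev/null -w '%{http_code}' "$url")" "$url"
    done
}

# Example usage:
#   check_portals SAMGRIDMON_SERVER
```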

Known Issues

When using yum to upgrade a machine from Update-11 to Update-12, the following exclude option is required:

yum --exclude=egee-NAGIOS update

Problems

A description of common problems when installing SAM can be found in the Troubleshooting section.

Document generated by Confluence on Feb 27, 2014 10:19